Each library (including technical replicates) was pseudo-aligned to the zebrafish transcriptome from Ensembl Release 94 (GRCz11) using kallisto (v0.43.1). In addition to the standard transcriptome, the two mutant psen2 transcripts were manually added to the reference.
The corresponding set of gene descriptions was then loaded into R as an EnsDb object using the AnnotationHub() infrastructure. Likewise, the set of transcript descriptions was loaded, with the manual addition of the two novel psen2 mutant transcripts.
Gene-level counts were imported using tximport, mapping transcripts to genes. Some genes exist in the primary assembly and on alternate assemblies for specific regions, and these were considered as separate transcripts of the same gene for read summarisation purposes. Transcript counts were thus mapped to genes using the gene symbol (e.g. psen2), instead of the gene id.
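The summarisation by gene symbol described above can be sketched in a few lines. This is a Python illustration of the idea only, not the tximport call used in the analysis. The transcript identifiers are taken from the tables in this report, while the counts are hypothetical values chosen for illustration.

```python
from collections import defaultdict

# Transcript-to-symbol map. The IDs appear elsewhere in this report; the two
# mutant transcripts were added manually and share the psen2 symbol.
tx2symbol = {
    "ENSDART00000006381": "psen2",
    "psen2N140fs": "psen2",
    "psen2T141_L142delinsMISLISV": "psen2",
    "ENSDART00000114613": "ptcd1",
}


def summarise_to_symbol(tx_counts, tx2symbol):
    """Sum estimated counts over all transcripts that share a gene symbol."""
    gene_counts = defaultdict(float)
    for tx_id, count in tx_counts.items():
        gene_counts[tx2symbol[tx_id]] += count
    return dict(gene_counts)


# Hypothetical estimated counts for a single library.
tx_counts = {
    "ENSDART00000006381": 120.0,
    "psen2N140fs": 15.5,
    "psen2T141_L142delinsMISLISV": 0.0,
    "ENSDART00000114613": 40.0,
}

gene_counts = summarise_to_symbol(tx_counts, tx2symbol)
print(gene_counts)
```

Keying on the symbol rather than the gene ID means that copies of a gene on the primary and alternate assemblies, which carry different gene IDs, contribute to a single total.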
Genes were retained for analysis if a CPM > 1 was observed for \(\geq\) 5 samples. This equated to about 31 reads for a gene in at least 5 samples for inclusion in downstream analysis, giving a total of 18,808 of the original genes for DGE analysis.
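The filtering rule can be illustrated with a short Python sketch. It assumes plain CPM (count divided by library size, times one million) without normalisation factors, and the library sizes and counts are hypothetical. A library size of 31 million was chosen only because a CPM of 1 then corresponds to the 31 reads quoted above.

```python
def cpm(counts, lib_sizes):
    """Counts per million for one gene across samples."""
    return [c / n * 1e6 for c, n in zip(counts, lib_sizes)]


def keep_gene(counts, lib_sizes, min_cpm=1.0, min_samples=5):
    """True if CPM exceeds min_cpm in at least min_samples samples."""
    n_above = sum(x > min_cpm for x in cpm(counts, lib_sizes))
    return n_above >= min_samples


# Hypothetical library sizes: six samples of 31 million reads each.
lib_sizes = [31_000_000] * 6

# Hypothetical counts for two genes.
expressed = [40, 35, 50, 32, 33, 10]  # above 31 reads in 5 samples
low = [40, 35, 50, 32, 5, 10]         # above 31 reads in only 4 samples

print(keep_gene(expressed, lib_sizes))  # retained
print(keep_gene(low, lib_sizes))        # discarded
```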
Total counts from each library after assigning to genes
Counts were also transformed using voom with sample quality weights, to allow for analysis using normal-based algorithms. Sample weights ranged between 0.4185 and 1.392, with the most strongly down-weighted sample being a WT sample.
Transcript-level counts were imported using catchKallisto() from edgeR in order to utilise the voom transformation on transcript-level counts.
Sample weights using transcript-level counts, showing near-identical patterns to those observed at the gene level.
CPM values for each psen2 transcript across all samples.
Transcript abundances (using CPM) were calculated for each of the three psen2 transcripts, and showed the expected patterns: heterozygous expression for FAD samples and exclusively WT expression for the WT samples. However, no WT allele was detected for sample 8_FS_4. This could not be explained, and this sample should be excluded from all analyses. The remaining FS samples showed reduced abundance of the FS transcript, as expected under nonsense-mediated decay (NMD). No increases in expression of the WT allele were evident, supporting a lack of genetic compensation.
This sample was then removed from all objects, along with 12_WT_4 which had been consistently down-weighted.
The next step was to perform an MDS analysis. However, minimal separation was observed between sample groups. A simple PCA also revealed that the first few principal components captured less of the total variability than might be expected.
MDS plot showing no clear groups within the data. Point sizes indicate sample weights as calculated by voomWithQualityWeights().
|  | PC1 | PC2 | PC3 | PC4 | PC5 |
|---|---|---|---|---|---|
| Standard deviation | 22.34 | 21.15 | 17 | 16.42 | 15.46 |
| Proportion of Variance | 0.1778 | 0.1594 | 0.1029 | 0.09607 | 0.08509 |
| Cumulative Proportion | 0.1778 | 0.3372 | 0.4401 | 0.5362 | 0.6213 |
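The three rows of this table are linked by simple arithmetic: the proportion of variance for each component is its squared standard deviation divided by the total variance, and the cumulative row is a running sum. The Python sketch below checks this using only the values printed in the table. It is a consistency check on the reported numbers, not the PCA code used in the analysis.

```python
# Values for PC1 to PC5, copied from the table above.
sdev = [22.34, 21.15, 17.0, 16.42, 15.46]
proportion = [0.1778, 0.1594, 0.1029, 0.09607, 0.08509]


def cumulative(values):
    """Running sum of a list of values."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


cum = cumulative(proportion)

# Each component implies the same total variance: sdev^2 / proportion.
implied_total = [s ** 2 / p for s, p in zip(sdev, proportion)]

print([round(c, 4) for c in cum])
print([round(t, 1) for t in implied_total])
```

The first five components together account for about 62% of the variance, and every component implies a similar total variance, as expected if the rows are internally consistent up to rounding.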
Three comparisons were defined with the first two being the difference between the two mutant families and the wild-type samples. The third comparison was defined as being between the two mutant groups.
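The three comparisons can be written as sets of weights on the group means. The Python sketch below illustrates this idea rather than the contrast matrix actually used. The group labels WT, FS and FAD follow the sample naming in this report, on the assumption that FAD denotes the psen2T141_L142delinsMISLISV/+ group, and the mean expression values are hypothetical.

```python
# Hypothetical mean log2 expression of a single gene in each group.
group_means = {"WT": 5.0, "FS": 4.4, "FAD": 5.1}

# Each comparison is a set of weights applied to the group means.
contrasts = {
    "FS_vs_WT": {"WT": -1, "FS": 1, "FAD": 0},
    "FAD_vs_WT": {"WT": -1, "FS": 0, "FAD": 1},
    "FAD_vs_FS": {"WT": 0, "FS": -1, "FAD": 1},
}


def log_fc(means, weights):
    """Weighted sum of group means, i.e. the logFC for one contrast."""
    return sum(weights[group] * means[group] for group in means)


logfc = {name: log_fc(group_means, w) for name, w in contrasts.items()}
for name, value in logfc.items():
    print(f"{name}: {value:+.2f}")
```

Under this formulation, the comparison between the two mutant groups is the difference of the two mutant-versus-WT comparisons, so the third logFC is determined by the first two.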
The first analysis was comparing psen2N140fs/+ samples to psen2+/+ samples. A total of 4 genes were potentially detected as differentially expressed using an FDR of 5%. In the following plots, a negative value for logFC corresponds to decreased expression in the heterozygous mutants.
MD plot for psen2N140fs/+ samples compared to psen2+/+ samples
Volcano plot for psen2N140fs/+ samples compared to psen2+/+ samples
| Symbol | logFC | AveExpr | P.Value | FDR |
|---|---|---|---|---|
| psen2 | -0.6211 | 4.591 | 2.827e-08 | 0.0002806 |
| CABZ01035279.1 | -9.689 | 0.4716 | 2.984e-08 | 0.0002806 |
| ptcd1 | 0.8541 | 2.486 | 3.256e-06 | 0.02041 |
| CU179663.1 | -0.9028 | 3.711 | 9.727e-06 | 0.04574 |
| BX649405.1 | -1.045 | 2.21 | 1.712e-05 | 0.06441 |
| atxn1l | -0.6898 | 2.457 | 2.681e-05 | 0.08404 |
| pcnp | 0.5006 | 5.545 | 7.107e-05 | 0.1909 |
| mhc1zea | -0.3211 | 4.509 | 0.000125 | 0.2524 |
| si:ch211-160d14.6 | -0.4693 | 4.582 | 0.000137 | 0.2524 |
| lrrc4ba | -0.4685 | 4.853 | 0.0001431 | 0.2524 |
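The relationship between the P.Value and FDR columns can be sketched as follows. This assumes the FDR column is a Benjamini-Hochberg adjustment over 18,808 tests (the number of retained genes), which is not stated explicitly in the text. The function is a plain Python illustration, not the routine used to produce the table. Only the first seven p-values from the table are used, so the results are upper bounds on the fully adjusted values, since lower-ranked genes that are not shown can only reduce them.

```python
def bh_adjust(pvalues, n_tests=None):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    n_tests defaults to len(pvalues). A larger value can be given when
    pvalues holds only the smallest p-values from a longer ranked list.
    """
    m = len(pvalues) if n_tests is None else n_tests
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    adjusted = [0.0] * len(pvalues)
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# First seven p-values from the table above.
top_p = [2.827e-08, 2.984e-08, 3.256e-06, 9.727e-06,
         1.712e-05, 2.681e-05, 7.107e-05]

fdr = bh_adjust(top_p, n_tests=18808)
n_de = sum(q < 0.05 for q in fdr)

print([f"{q:.4g}" for q in fdr])
print(n_de)
```

Under these assumptions the adjusted values agree closely with the FDR column for the rows used, and four genes fall below the 5% threshold. The first two genes share an adjusted value because the adjustment enforces monotonicity across ranks.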
Expression patterns for significantly DE genes in the comparison between psen2N140fs/+ samples and psen2+/+ samples. Values are given as CPM using an offset of 1 to avoid zeroes, with the y-axis being displayed on the log scale.
Expression patterns for the next most highly ranked genes in the comparison between psen2N140fs/+ samples and psen2+/+ samples, but which are not formally considered as DE. Values are given as CPM using an offset of 1 to avoid zeroes, with the y-axis being displayed on the log scale.
The next analysis was comparing psen2T141_L142delinsMISLISV/+ samples to psen2+/+ samples. No genes could be considered as DE using an FDR anywhere up to 50%. In the following plots, a negative value for logFC corresponds to decreased expression in the heterozygous mutants.
MD plot for psen2T141_L142delinsMISLISV/+ samples compared to psen2+/+ samples
Volcano plot for psen2T141_L142delinsMISLISV/+ samples compared to psen2+/+ samples
| Symbol | logFC | AveExpr | P.Value | FDR |
|---|---|---|---|---|
| si:ch73-236c18.2 | 1.182 | 2.05 | 4.18e-05 | 0.7862 |
| tnk2a | 0.2508 | 5.591 | 0.0001201 | 0.9657 |
| BX548026.1 | -0.7332 | 0.9484 | 0.0002037 | 0.9657 |
| si:ch211-56a11.2 | 0.9663 | 1.25 | 0.0002287 | 0.9657 |
| stn1 | 0.4968 | 3.078 | 0.0003219 | 0.9657 |
| BX890543.1 | -0.6474 | 2.308 | 0.000329 | 0.9657 |
| si:ch211-15j1.4 | 0.9632 | 4.827 | 0.0004469 | 0.9657 |
| celsr1b | -0.2697 | 5.413 | 0.0004759 | 0.9657 |
| EIF1B | -0.8818 | 7.518 | 0.0005065 | 0.9657 |
| si:ch211-114l13.4 | 1.051 | 2.016 | 0.0006597 | 0.9657 |
Expression patterns for the 5 most highly ranked genes in the comparison between psen2T141_L142delinsMISLISV/+ samples and psen2+/+ samples. None were considered as DE. Values are given as CPM using an offset of 1 to avoid zeroes, with the y-axis being displayed on the log scale.
The final analysis was comparing psen2T141_L142delinsMISLISV/+ samples to psen2N140fs/+ samples. A total of 6 genes were potentially detected as differentially expressed using an FDR of 5%. In the following plots, a negative value for logFC corresponds to decreased expression in psen2T141_L142delinsMISLISV/+ samples, whilst a positive value for logFC corresponds to increased expression in psen2T141_L142delinsMISLISV/+ samples.
MD plot for psen2T141_L142delinsMISLISV/+ samples compared to psen2N140fs/+ samples
Volcano plot for psen2T141_L142delinsMISLISV/+ samples compared to psen2N140fs/+ samples
| Symbol | logFC | AveExpr | P.Value | FDR |
|---|---|---|---|---|
| psen2 | 0.7093 | 4.591 | 4.096e-09 | 7.703e-05 |
| CABZ01035279.1 | 8.766 | 0.4716 | 8.255e-08 | 0.0007763 |
| CU179663.1 | 0.9466 | 3.711 | 4.696e-06 | 0.02944 |
| si:ch73-236c18.2 | 1.448 | 2.05 | 8.035e-06 | 0.03778 |
| si:ch211-114l13.3 | 2.008 | -0.1774 | 1.133e-05 | 0.04263 |
| si:ch211-114l13.4 | 1.651 | 2.016 | 1.362e-05 | 0.04268 |
| BX649405.1 | 0.9934 | 2.21 | 2.384e-05 | 0.06406 |
| CABZ01084501.2 | 0.6405 | 3.94 | 4.723e-05 | 0.111 |
| mcoln1a | -0.5633 | 4.202 | 5.725e-05 | 0.1196 |
| BX649434.3 | 0.7764 | 2.85 | 8.792e-05 | 0.1654 |
Expression patterns for significantly DE genes in the comparison between psen2T141_L142delinsMISLISV/+ samples and psen2N140fs/+ samples. This is essentially a subset of the previously identified genes.
Expression patterns for the next most highly ranked genes in the comparison between psen2T141_L142delinsMISLISV/+ samples and psen2N140fs/+ samples, but which are not formally considered as DE.
Although there were minimal DE genes in the above analysis, the similarity between differential expression values (i.e. logFC) was inspected visually.
Comparison between mutants showing logFC for each mutant, based on comparison against WT samples. Genes considered statistically significant between mutants are highlighted in red. Several other genes either demonstrated highly similar behaviour or were suggestive of different behaviours, and these genes are labelled in grey. Dashed horizontal and vertical lines have been placed at \(\pm1\), with the unit line also shown in pale blue.
Expression patterns for genes showing similarity of apparent differential expression across both mutants when compared to WT samples. In all cases, one or more outlier points appear to have impacted the ability for these genes to be considered as DE. With the exception of the first two genes, these outlier samples were not consistent. Jitter has been added to the x-axis.
Expression patterns for genes showing potential differences in apparent differential expression across both mutants when compared to WT samples. In all cases, one or more outlier points appear to have impacted the ability for these genes to be considered as DE. With the exception of the second and third genes in the top row, these outlier samples were not consistent. Jitter has been added to the x-axis.
As transcript complexity is lower in zebrafish than in human, and 1:1 mapping between species is less robust, only a brief analysis was performed at the transcript level. In essence, the same genes were found as the most highly ranked, with changes in expression of psen2 transcripts detected as expected, providing a form of positive control. Following the top tables, the basic transcript expression patterns are shown for three possible genes of interest. Notably, the transcripts showing the strongest differential expression are expressed at very low levels for both si:ch211-132g1.3 and slc37a4b.
| Transcript | Symbol | logFC | AveExpr | P.Value | FDR | gene_id |
|---|---|---|---|---|---|---|
| ENSDART00000187524 | CABZ01035279.1 | -8.715 | 0.2798 | 1.258e-07 | 0.003779 | ENSDARG00000116774 |
| ENSDART00000137332 | si:ch211-132g1.3 | -5.549 | -1.839 | 3.027e-07 | 0.004548 | ENSDARG00000089477 |
| psen2N140fs | psen2 | 3.63 | -4.436 | 2.665e-06 | 0.02669 | ENSDARG00000015540 |
| ENSDART00000114613 | ptcd1 | 0.8547 | 2.524 | 4.392e-06 | 0.03299 | ENSDARG00000076176 |
| ENSDART00000185608 | si:ch211-160d14.6 | -6.75 | -1.138 | 8.904e-06 | 0.05351 | ENSDARG00000115710 |
| ENSDART00000188158 | BX649405.1 | -1.041 | 2.066 | 2.743e-05 | 0.1374 | ENSDARG00000112605 |
| ENSDART00000127351 | atxn1l | -0.6731 | 2.866 | 4.159e-05 | 0.1738 | ENSDARG00000086977 |
| ENSDART00000006381 | psen2 | -0.9806 | 1.091 | 4.675e-05 | 0.1738 | ENSDARG00000015540 |
| ENSDART00000182716 | actb1 | -4.425 | -2.592 | 5.207e-05 | 0.1738 | ENSDARG00000113649 |
| ENSDART00000101586 | pcnp | 0.8028 | 2.719 | 7.057e-05 | 0.2121 | ENSDARG00000037713 |
| Transcript | Symbol | logFC | AveExpr | P.Value | FDR | gene_id |
|---|---|---|---|---|---|---|
| psen2T141_L142delinsMISLISV | psen2 | 5.931 | -3.349 | 1.13e-11 | 3.394e-07 | ENSDARG00000015540 |
| ENSDART00000006381 | psen2 | -0.9687 | 1.091 | 2.606e-05 | 0.3915 | ENSDARG00000015540 |
| ENSDART00000168762 | si:ch73-236c18.2 | 1.199 | 0.9262 | 4.543e-05 | 0.4551 | ENSDARG00000103829 |
| ENSDART00000078192 | cnpy4 | 0.9683 | 0.3566 | 9.046e-05 | 0.6795 | ENSDARG00000055797 |
| ENSDART00000133864 | gpr143 | -0.8098 | 0.5032 | 0.0001229 | 0.7386 | ENSDARG00000034572 |
| ENSDART00000168837 | fam168b | 1.738 | 2.394 | 0.0001914 | 0.866 | ENSDARG00000101733 |
| ENSDART00000185486 | BX890543.1 | -0.671 | 2.583 | 0.0002827 | 0.866 | ENSDARG00000114583 |
| ENSDART00000134826 | si:ch211-15j1.4 | 0.9712 | 4.025 | 0.0003188 | 0.866 | ENSDARG00000092604 |
| ENSDART00000027624 | stn1 | 0.4981 | 3.561 | 0.0003206 | 0.866 | ENSDARG00000007734 |
| ENSDART00000144157 | si:ch211-56a11.2 | 0.9227 | 1.733 | 0.0003439 | 0.866 | ENSDARG00000093677 |
| Transcript | Symbol | logFC | AveExpr | P.Value | FDR | gene_id |
|---|---|---|---|---|---|---|
| psen2T141_L142delinsMISLISV | psen2 | 5.913 | -3.349 | 2.255e-11 | 6.776e-07 | ENSDARG00000015540 |
| ENSDART00000187524 | CABZ01035279.1 | 7.696 | 0.2798 | 4.478e-07 | 0.006727 | ENSDARG00000116774 |
| ENSDART00000137332 | si:ch211-132g1.3 | 4.79 | -1.839 | 1.389e-06 | 0.01392 | ENSDARG00000089477 |
| psen2N140fs | psen2 | -3.635 | -4.436 | 2.056e-06 | 0.01545 | ENSDARG00000015540 |
| ENSDART00000150193 | slc37a4b | -1.158 | -0.06034 | 3.878e-06 | 0.02331 | ENSDARG00000077180 |
| ENSDART00000168762 | si:ch73-236c18.2 | 1.453 | 0.9262 | 9.67e-06 | 0.04843 | ENSDARG00000103829 |
| ENSDART00000141678 | si:ch211-114l13.3 | 1.966 | -0.9645 | 1.788e-05 | 0.07342 | ENSDARG00000094346 |
| ENSDART00000147678 | si:dkey-222h21.2 | 2.015 | 0.6033 | 1.955e-05 | 0.07342 | ENSDARG00000094297 |
| ENSDART00000188136 | CABZ01084501.2 | 0.6341 | 4.423 | 4.034e-05 | 0.1226 | ENSDARG00000113332 |
| ENSDART00000185608 | si:ch211-160d14.6 | 5.695 | -1.138 | 4.419e-05 | 0.1226 | ENSDARG00000115710 |
One transcript (ENSDART00000137332) was undetectable in the FS mutants, and this is a non-coding transcript.
|  | seqnames | start | end | width | strand | tx_id | tx_biotype |
|---|---|---|---|---|---|---|---|
| ENSDART00000131675 | 1 | 1874427 | 1885594 | 11168 | - | ENSDART00000131675 | protein_coding |
| ENSDART00000165669 | 1 | 1888541 | 1894722 | 6182 | - | ENSDART00000165669 | protein_coding |
| ENSDART00000137332 | 1 | 1874008 | 1885600 | 11593 | - | ENSDART00000137332 | processed_transcript |
| ENSDART00000143790 | 1 | 1887620 | 1928122 | 40503 | - | ENSDART00000143790 | processed_transcript |
| ENSDART00000147773 | 1 | 1890171 | 1894005 | 3835 | - | ENSDART00000147773 | processed_transcript |
R version 3.5.2 (2018-12-20)
**Platform:** x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_AU.UTF-8, LC_NUMERIC=C, LC_TIME=en_AU.UTF-8, LC_COLLATE=en_AU.UTF-8, LC_MONETARY=en_AU.UTF-8, LC_MESSAGES=en_AU.UTF-8, LC_PAPER=en_AU.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_AU.UTF-8 and LC_IDENTIFICATION=C
attached base packages: stats4, parallel, stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: bindrcpp(v.0.2.2), ensembldb(v.2.6.5), AnnotationFilter(v.1.6.0), GenomicFeatures(v.1.34.3), AnnotationDbi(v.1.44.0), Biobase(v.2.42.0), GenomicRanges(v.1.34.0), GenomeInfoDb(v.1.18.1), IRanges(v.2.16.0), S4Vectors(v.0.20.1), pander(v.0.6.3), ggrepel(v.0.8.0), forcats(v.0.3.0), stringr(v.1.4.0), dplyr(v.0.7.8), purrr(v.0.3.0), readr(v.1.3.1), tidyr(v.0.8.2), tibble(v.2.0.1), ggplot2(v.3.1.0), tidyverse(v.1.2.1), scales(v.1.0.0), magrittr(v.1.5), AnnotationHub(v.2.14.3), BiocGenerics(v.0.28.0), tximport(v.1.10.1), edgeR(v.3.24.3) and limma(v.3.38.3)
loaded via a namespace (and not attached): colorspace(v.1.4-0), XVector(v.0.22.0), rstudioapi(v.0.9.0), bit64(v.0.9-7), interactiveDisplayBase(v.1.20.0), lubridate(v.1.7.4), xml2(v.1.2.0), knitr(v.1.21), jsonlite(v.1.6), Cairo(v.1.5-9), Rsamtools(v.1.34.1), broom(v.0.5.1), shiny(v.1.2.0), BiocManager(v.1.30.4), compiler(v.3.5.2), httr(v.1.4.0), backports(v.1.1.3), assertthat(v.0.2.0), Matrix(v.1.2-15), lazyeval(v.0.2.1), cli(v.1.0.1), later(v.0.8.0), htmltools(v.0.3.6), prettyunits(v.1.0.2), tools(v.3.5.2), gtable(v.0.2.0), glue(v.1.3.0), GenomeInfoDbData(v.1.2.0), Rcpp(v.1.0.0), cellranger(v.1.1.0), Biostrings(v.2.50.2), nlme(v.3.1-137), rtracklayer(v.1.42.1), crosstalk(v.1.0.0), xfun(v.0.4), rvest(v.0.3.2), mime(v.0.6), XML(v.3.98-1.17), zlibbioc(v.1.28.0), hms(v.0.4.2), promises(v.1.0.1), ProtGenerics(v.1.14.0), SummarizedExperiment(v.1.12.0), rhdf5(v.2.26.2), yaml(v.2.2.0), curl(v.3.3), memoise(v.1.1.0), biomaRt(v.2.38.0), stringi(v.1.2.4), RSQLite(v.2.1.1), highr(v.0.7), BiocParallel(v.1.16.6), rlang(v.0.3.1), pkgconfig(v.2.0.2), bitops(v.1.0-6), matrixStats(v.0.54.0), evaluate(v.0.13), lattice(v.0.20-38), Rhdf5lib(v.1.4.2), bindr(v.0.1.1), htmlwidgets(v.1.3), GenomicAlignments(v.1.18.1), labeling(v.0.3), bit(v.1.1-14), tidyselect(v.0.2.5), plyr(v.1.8.4), R6(v.2.3.0), generics(v.0.0.2), DelayedArray(v.0.8.0), DBI(v.1.0.0), pillar(v.1.3.1), haven(v.2.0.0), withr(v.2.1.2), RCurl(v.1.95-4.11), modelr(v.0.1.3), crayon(v.1.3.4), plotly(v.4.8.0), rmarkdown(v.1.11), progress(v.1.2.0), locfit(v.1.5-9.1), grid(v.3.5.2), readxl(v.1.2.0), data.table(v.1.12.0), blob(v.1.1.1), digest(v.0.6.18), xtable(v.1.8-3), httpuv(v.1.4.5.1), munsell(v.0.5.0) and viridisLite(v.0.3.0)